Assessing learners’ writing using lexical frequency
Authors: R. Goodfellow et al.
Abstract
In this work we set out to investigate the feasibility of applying measures of lexical frequency to the assessment of the writing of learners of French. A system developed for analysing the lexical knowledge of learners, according to their productive use of high and low frequency words (Laufer and Nation 1995), was adapted for French and used to analyse learners’ texts from an Open University French course. Whilst we found that this analysis could not be said to reflect the state of the learners’ vocabulary knowledge in the same way that Laufer and Nation’s study did, elements of the system’s output did correlate significantly with scores awarded by human markers for vocabulary use in these texts. This suggests that the approach could be used for self-assessment. However, the feedback that can be given to learners on the basis of the current analysis is very limited. Nevertheless, the approach has the potential for refinement and, when enhanced with information derived from successive cohorts of learners performing similar writing tasks, could be a first step in the development of a viable aid for learners evaluating their own writing.

1 Automatic text analysis

Technologies for giving automatic feedback to learners are of particular interest to providers of large-scale language courses at a distance, such as the Open University. Such feedback is seen as a way to support students’ motivation and the development of study skills, without increasing the workload on tutors or the financial cost to the institution. Unfortunately, the quality of currently available automatic feedback on written language use is rather low, due to the technical difficulty of text analysis, and a lack of conviction in the CALL community of its usefulness. (Ironically, the level of work going on in the technically even more difficult area of speech analysis is much higher; see Ehsani and Knodt, 1998.)

The most commonly used applications of text analysis are spell-checking, grammar-checking, and style and usage checking. These are now widely available and are useful tools. They are, however, very limited in their suitability for automatic feedback, because they focus on short (word or phrase) segments of language, which are analysed in isolation. Pennington (1992) has criticised the feedback such tools give as ‘out of context’, and ‘arbitrary’ in their decisions about style and readability. As a way of distinguishing between levels of language knowledge or competency they are even less appropriate.

Similar criticisms can be levelled at syntactic parsers, although they have interested CALL researchers for some time. Whilst many interesting and ingenious prototypes have been developed, mainly in languages with a high degree of regularity, such as German, their application has remained focused on the analysis of individual errors, and their use restricted to the research lab (e.g. Vandeventer, 2001). Parsing free text for meaningful feedback is still a very hard problem, and more recent developments in automatic text analysis have tended towards more statistical approaches. An example is Latent Semantic Analysis (see Foltz et al., 1999), a method of comparing texts with ‘models’ representing the genre they belong to.
Whilst it has proved a feasible way of automatically spotting non-standard student writing in subject areas such as psychology, it has not yet been applied to language learners’ texts, nor to the provision of feedback to the students themselves rather than to the people who are marking the texts. In any case it is computationally quite complex and would demand a greater level of resource to investigate than is available to most university language departments.

It seems clear that technologies for providing feedback on accuracy or meaning in learners’ texts are complex, and still fall far short of what is needed if those learners are to make use of the feedback to improve their writing. Our approach in the work described here has been to focus our attention instead on the area of vocabulary – the individual lexical items that learners use – and to take a relatively simple process of automatic analysis which has been shown to be a reliable measure of knowledge in one context, and try to adapt it to the requirements of assessment in another. The process is known as the Lexical Frequency Profile (LFP).

2 Lexical Frequency Profile (LFP)

The 2000 most-frequent words of English account for 79.9% of written text (Laufer, 1999). Knowledge of these 2000 most-frequent words plus the 570 most-frequent ‘academic’ words is considered ‘critical for academic success’ (Beglar, 1999). Because most L2 words are learned incidentally (i.e. through reading and listening rather than through specific vocabulary-learning exercises), we can assume that a learner’s vocabulary builds up in layers made up of words having similar frequencies. We could expect vocabulary knowledge at an early stage of development to consist mainly of high frequency words, and at a later stage to have a higher proportion of low frequency words.

The lexical frequency profile method of assessing vocabulary knowledge by analysing learners’ texts was developed by Laufer and Nation (1995). They developed a procedure which categorises the words in a learner’s text according to which frequency band each word belongs to: first 1000 most-frequent, second 1000 most-frequent, and the 570 most-frequent ‘academic’ words. Academic words are defined in the University Word List – 836 word families containing vocabulary that is not in the first 2000 words of English, but which is frequent and has a wide range across a variety of written academic texts from a variety of disciplines (Xue and Nation, 1984). They called this analysis the lexical frequency profile (LFP) of the text. The LFP analyser program (now renamed RANGE) can be downloaded from Paul Nation’s web site: http://www.vuw.ac.nz/lals/staff/paul_nation/index.html. The program shows the numbers and percentages of words and word families in a target English text coming from each of the three word lists, plus those which are not recognised (see Table 1).

Table 1. Sample output from the LFP program

A. Word List        B. Tokens/%    C. Types/%    D. Families
one                 54/72.0        34/69.4       33
two                  2/ 2.7         2/ 4.1        2
three               14/18.7         9/18.4        9
not in the lists     5/ 6.7         4/ 8.2       ?????
Total               75             49            44

In Table 1, all the words in a sample text have been classified into categories of frequency (word list one is the first 1000 most-frequent words in English; column B, row 2, shows the number and percentage of words in the text that come from that list, and so on). The program has also performed a type and token analysis. A token is any occurrence of a word form in the text, regardless of whether it is occurring for the first or the nth time. A type is any word form which occurs at least once, counted only once regardless of how many more times it might occur. Both numbers and percentages of occurrences are given.
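To make the counting step concrete, the following is a minimal sketch of an LFP-style type and token analysis over the three frequency bands. It is our own illustration, not the RANGE program itself: the tokeniser, the `band_lists` structure and the output layout are assumptions, and family counts are omitted here because they require the lemmatisation discussed below.

```python
import re
from collections import Counter

def tokenise(text):
    # Naive word-form tokeniser (lower-cased); a real system would need
    # something more careful, especially for French elision (l', qu').
    return re.findall(r"[a-zàâäçéèêëîïôöûùüÿœæ'-]+", text.lower())

def lexical_frequency_profile(text, band_lists):
    """band_lists: dict (in band order) mapping band name -> set of
    word forms. Returns per-band token and type counts with
    percentages, mirroring the layout of Table 1."""
    tokens = tokenise(text)
    types = set(tokens)

    def band_of(word):
        # First list containing the word wins; unknown words fall through.
        for name, words in band_lists.items():
            if word in words:
                return name
        return "not in the lists"

    token_counts = Counter(band_of(t) for t in tokens)
    type_counts = Counter(band_of(t) for t in types)

    profile = {}
    for band in list(band_lists) + ["not in the lists"]:
        profile[band] = (
            token_counts[band], 100.0 * token_counts[band] / len(tokens),
            type_counts[band], 100.0 * type_counts[band] / len(types),
        )
    return profile

# e.g. lexical_frequency_profile(essay_text,
#          {"one": first_1000, "two": second_1000, "three": academic})
```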
A word family is the base form of a word, such as might appear as a headword in a dictionary, plus its derived and inflected forms. For the LFP word families, Laufer and Nation included all the derivations and inflections listed at ‘level 3’ of the scale devised in Bauer and Nation (1993), that is: -able, -er, -ish, -less, -ly, -ness, -th, -y, non-, and un-. The LFP program is not able to classify into word families any words that do not appear in the three frequency lists (hence the question marks in the fifth row of column D).

Laufer and Nation showed that the LFP measure of learners’ texts can be compared with scores that the same learners achieve on standard vocabulary tests. They found that there is a correlation between performance on vocabulary tests and the proportions of low and high-frequency words in the free-written texts. They give the following results for correlation between the use that their English learners at the University of Haifa made of high and low frequency word families, and their scores in a vocabulary-based ‘levels’ test (see Table 2).

Table 2. % of word families from each frequency band correlated against levels test scores (N = 65). (Data re-presented from Laufer and Nation, 1995, p. 317)

A                    B. % 1st 1000      C. % 2nd 1000       D. % Academic      E. % word families
                     word families      word families       word families      not in the other 3
                     (high frequency)   (medium frequency)  (low frequency)    lists (low frequency)
                     Text1    Text2     Text1    Text2      Text1    Text2     Text1    Text2
Levels Test/LFP      –0.7     –0.7      0.01     0.2        0.7      0.6       0.6      0.8
Significance         0.0001   0.0001    0.9      0.3        0.0001   0.001     0.0002   0.0001

In Table 2, the negative correlations in column B show that learners who used higher proportions of high-frequency words in their texts scored lower in the vocabulary test, and vice versa. The positive correlations in column D show that learners who used higher proportions of academic words in their text also scored higher in the vocabulary test. Similarly for column E, which deals with words that were not in the first three lists and are therefore by definition low frequency. Laufer and Nation conclude that use of low frequency words is an indicator of richness in a learner’s vocabulary, and recommend this procedure as a stable and reliable measure of lexical use in writing.

Laufer and Nation’s test procedures were carefully controlled and their subjects were mainly students who had a common educational background in the Israeli school system. They were therefore able to make a strong case for the usefulness of the LFP measure for curriculum-design purposes. Our interest in it, on the other hand, was less experimental and more opportunistic, as we saw in it a potential source of automatic feedback to distance learners on the quality of the texts they submit for assessment. Whilst the LFP focuses only on vocabulary, we assumed that the learner’s use of vocabulary would be an important determinant of the overall quality of their text (Laufer and Nation report in the same paper on two studies which found correlations between lexical measures and more holistic measures of quality in written text). If the LFP was capable of providing a reliable measure of the learner’s lexical knowledge as reflected in a text, in the way that Laufer and Nation’s study suggested, then we hypothesised that its analysis might bear some relation to the scores that human markers gave the same text, especially where they were marking specifically for vocabulary use. We saw in this hypothesised relation the potential to give a learner some indication of the kind of mark they might get for a free-writing assignment, before it was marked. Feedback of this type, we believed, would be useful in a formative way, giving the learner a focus for reflection on their work as well as an opportunity to improve it before submission. A study was set up to determine whether an LFP measure did in fact correspond to tutor marks for a group of assignments on one of the OU’s French courses.
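Before turning to the study, the ‘level 3’ family matching described at the start of this section can be illustrated with the sketch below. The affix list is the one quoted above from Bauer and Nation (1993), but the bare prefix/suffix stripping is a crude simplification of our own; it ignores the spelling changes (e.g. happy/happiness) that real morphology involves, and it is not the actual procedure used by the RANGE program.

```python
# Level-3 affixes as listed in Bauer and Nation (1993) and quoted above.
LEVEL3_SUFFIXES = ("able", "er", "ish", "less", "ly", "ness", "th", "y")
LEVEL3_PREFIXES = ("non", "un")

def family_headword(word, headwords):
    """Return the list headword whose family `word` belongs to, or None
    if it cannot be assigned (cf. the ????? entry in Table 1)."""
    if word in headwords:
        return word
    for pre in LEVEL3_PREFIXES:
        if word.startswith(pre) and word[len(pre):] in headwords:
            return word[len(pre):]
    for suf in LEVEL3_SUFFIXES:
        if word.endswith(suf) and word[:-len(suf)] in headwords:
            return word[:-len(suf)]
    return None
```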
3 Comparing the LFP with tutor marks

The testbed we chose for the study was the OU Level 1 French course L120. The reasons for choosing this course, given that the OU does not have an English programme which would have enabled us to use the Laufer and Nation system more or less exactly as they did, were:
• The level is appropriate (low intermediate).
• The course has at least one tutor-marked assignment which is graded under four criteria, one of which is explicitly vocabulary-related.
• Because of the amount of work going on in French lexicography, it is feasible that word frequency lists could be found or developed for this language.

Once the French LFP system was built we proposed to test it in two ways: firstly, by comparing its analysis of a number of L120 tutor-marked assignments (TMAs) with the marks given by the course tutors under the vocabulary criterion; secondly, by having the system evaluated qualitatively by learners and teachers, to establish the optimal form in which feedback on a text should be given, in order to help a learner to benefit from it. In the event, the second part of the evaluation has not yet been carried out, and this paper focuses only on the results of the first.

3.1 Creating the word lists

In adapting the LFP program for French texts it was found necessary to create the French word-frequency lists from scratch, as no suitable equivalent already existed. The general lists (first 1000 and second 1000 most frequent words) were extracted from word lists developed and lemmatised (categorised into word families) at the Catholic University of Leuven, as part of the Eurolex project (Verlinde and Selva, 2001), from a corpus of texts from Le Monde and Le Soir. The academic list was extracted from the ELRA Parole French corpus (available for purchase from the European Language Resources Association at http://www.elda.fr), and lemmatised by one of the current authors. His report on some of the feasibility considerations relating to this work is available at http://iet.open.ac.uk/pp/r.goodfellow/ltic/report1.htm.
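For illustration, the extraction of the two general lists from a lemmatised frequency resource might look like the sketch below. The tab-separated ‘lemma, frequency’ file format is our assumption for the example, not the actual format of the Leuven or Parole data.

```python
def load_lemma_frequencies(path):
    """Read one 'lemma<TAB>frequency' pair per line (an assumed format)
    into a dict, summing counts for repeated lemmas."""
    freqs = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            lemma, freq = line.rstrip("\n").split("\t")
            freqs[lemma] = freqs.get(lemma, 0) + int(freq)
    return freqs

def general_bands(freqs, band_size=1000):
    """Split lemmas by descending corpus frequency into the first- and
    second-1000 bands used by the profile."""
    ranked = sorted(freqs, key=freqs.get, reverse=True)
    return set(ranked[:band_size]), set(ranked[band_size:2 * band_size])
```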
3.2 The study procedure

For the comparison we transcribed 36 student essays which had been submitted and marked during a recent presentation of the L120 course, and submitted them to the French LFP program for analysis. We then searched for correlations between key aspects of the lexical profile for each text, and the marks awarded for grammatical accuracy and vocabulary range. The main differences between our procedure and Laufer and Nation’s were as follows:
• They used specially written texts, on ‘essay/discussion’ topics such as “Should a government be allowed to limit the number of children a family can have?” or “A person cannot be poor and happy..”. Our project selected from texts submitted for assignment by learners on L120 – all were on the same topic, a ‘journalistic’ account of the life of fire-fighters in Quebec.
• Laufer and Nation had all their texts ‘corrected’ by hand prior to processing: obviously incorrect words were deleted, misspelled words were corrected, and proper nouns were deleted. We wanted to limit human intervention as far as possible, but on the assumption that learners able to use any feedback system based on this analysis would also be able to use a French spellchecker, all texts were spellchecked; where obvious corrections were suggested these were accepted, but where appropriate corrections were not obvious or not suggested the word was deleted. Two proper nouns that occurred in most of the texts were deleted.
• Laufer and Nation may have done some manual post-processing of the LFP output. This is not acknowledged in the 1995 paper, but can be inferred from the fact that they report figures which include word ‘families’ not found in the frequency lists. As the analyser is not able to categorise words which do not appear in the lists, it is assumed that they had the ‘not-in-a-list’ words assigned to families manually. Our analysis does not use the category word ‘family’ for these unrecognised words, but instead uses word ‘type’.
• Where they compared their LFP analysis of students’ texts with results in vocabulary tests, we compared LFP analysis of the L120 student texts with the marks the tutors had given. Each tutor had given a mark out of 25 for each of four criteria: two ‘content-related’ criteria, one ‘accuracy’ criterion and one ‘vocabulary range’ criterion.

3.3 Discussion of first results

The initial comparison (see Table 3) did not produce the same kinds of correlation between the LFP analysis and the tutors’ marks as Laufer and Nation found between LFP and vocabulary test scores. In Table 3, the correlations are neither as strong as Laufer and Nation found, nor do they occur in the same areas of the data. Weak negative correlations (p = 0.05) exist between the use of high frequency word families and marks for range and accuracy (column B), where Laufer and Nation found strong ones, and there is no correlation at all between use of academic words (column D) or ‘not-in-a-list’ word types (column E) and the tutor marks. On the other hand, medium-strength correlations (p = 0.01) were found between use of medium frequency word families (column C) and the range and accuracy marks, whereas Laufer and Nation found no correlation at this level of frequency.

These differences in strength between Laufer and Nation’s correlations and ours might be explained by the less-controlled conditions of our study. The L120 adult distance learners are likely to have been more varied individually in age and background (43 of Laufer and Nation’s subjects were recent graduates from the Israeli school system and had passed the same entrance exam).
The tutors’ marks against which the L120 LFP scores were correlated were produced by four different tutors and had not been standardised.
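The comparison reported in section 3.3 amounts to correlating, across the 36 essays, each LFP percentage with the corresponding tutor marks. A minimal version using SciPy’s Pearson correlation is sketched below; the choice of tool and the variable names are ours, as the paper does not state how the correlations were computed.

```python
from scipy.stats import pearsonr

def correlate_with_marks(lfp_percentages, tutor_marks):
    """Pearson correlation between one LFP measure (e.g. % of
    medium-frequency word families per essay) and the tutors'
    marks out of 25 for the same essays."""
    r, p = pearsonr(lfp_percentages, tutor_marks)
    return r, p

# Usage (hypothetical variables, one value per essay): this is where,
# for example, the medium-frequency band showed p = 0.01 in the study.
# r, p = correlate_with_marks(pct_second_1000, vocabulary_range_marks)
```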